Novel DNA molecules and hosts

The invention concerns novel DNA molecules encoding a modified, endoplasmic reticulum-located "dibasic
processing endoprotease" and the use of said endoplasmic reticulum-located "dibasic processing endoprotease"
for the correct processing of heterologous polypeptides in transformed hosts.

Background of the Invention

The production of pharmaceutically applicable or enzymatically active proteins is a key area in the rapidly
developing biotechnology industry. Since the beginning of the era of recombinant DNA technology a great number
of valuable heterologous proteins have been produced in and secreted from eukaryotic host cells which
had been transformed with suitable expression vectors containing DNA sequences coding for said proteins.
One of the major problems with the production of secreted proteins in eukaryotic expression systems is to avoid
malfolded, biologically inactive product.

It is now generally accepted that proteins destined for secretion from eukaryotic cells are translocated to
the endoplasmic reticulum (ER) due to the presence of a signal sequence which is cleaved off by the enzyme
signal peptidase located in the rough ER membrane. The protein is then transported from the ER to the Golgi
and via Golgi-derived secretory vesicles to the cell surface (S. Pfeffer and J. Rothman, Ann. Rev. Biochem.
56:829-52, 1987). Another major step in the production of correctly processed and correctly folded proteins
is the conversion of proproteins to the mature forms in the Golgi apparatus and secretory vesicles. The
cleavage of the proprotein occurs at a so-called dibasic site, i.e. a motif consisting of at least two basic
amino acids. The processing is catalysed by enzymes located in the Golgi apparatus, the so-called "dibasic
processing endoproteases".

There are different "dibasic processing endoproteases" known which are involved in the processing of protein
precursors, for example the mammalian proteases furin, PC2, PC1 and PC3, and the product of the yeast
YAP3 gene and yeast yscF (also named KEX2 gene product; herein referred to as KEX2p).

KEX2p is involved in the maturation of the yeast mating pheromone α-factor (J. Kurjan and I. Herskowitz,
Cell 30:933-943, 1982). The α-factor is produced as a 165 amino acid precursor which is processed during
the transport to the cell surface. In the first step, the 19-amino acid signal sequence (pre-sequence) is cleaved
off by the signal peptidase. Then the precursor is glycosylated and moves to the Golgi where a 66-amino acid
pro-sequence is cut off by KEX2p. The α-factor pre-pro-sequence is also known as α-factor "leader" sequence.
A second protease in the Golgi apparatus, i.e. the KEX1 gene product, is responsible for the final maturation
of the protein.
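
The residue arithmetic of this processing pathway can be sketched as follows. This is only an illustration: the lengths 165, 19 and 66 are taken from the paragraph above, while the function name and the keys of the returned dictionary are hypothetical.

```python
def alpha_factor_segments(precursor_len=165, pre_len=19, pro_len=66):
    """Return 1-based, inclusive residue ranges of the precursor regions.

    The default lengths are the figures quoted in the text; the names
    used here are illustrative assumptions.
    """
    if pre_len + pro_len > precursor_len:
        raise ValueError("pre- and pro-sequence exceed the precursor length")
    pre = (1, pre_len)                               # removed by signal peptidase
    pro = (pre_len + 1, pre_len + pro_len)           # removed by KEX2p
    rest = (pre_len + pro_len + 1, precursor_len)    # left for further maturation
    return {"pre": pre, "pro": pro, "rest": rest}


segments = alpha_factor_segments()
leader_length = segments["pro"][1]   # pre- plus pro-sequence, i.e. the "leader"
```

With the default figures the leader spans the first 85 residues, leaving 80 residues downstream of the KEX2p cleavage site.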

KEX2p is encoded by the KEX2 gene and consists of an N-terminal catalytic domain, a Ser/Thr-rich domain,
a membrane-spanning domain and a C-terminal tail responsible for Golgi localization. Mutant KEX2p enzyme
lacking approximately 200 C-terminal amino acids, including the Ser/Thr-rich domain, the membrane-spanning
domain and the C-terminal tail, still retains KEX2p protease function, viz. cleavage at the C-terminal side
of a pair of basic amino acids, such as Lys-Arg or Arg-Arg [Fuller et al., 1989, Proc. Natl. Acad. Sci. 86,
1434-1438; Fuller et al., 1989, Science 246, 482-485].

Leader sequences such as the yeast α-factor leader sequence are widely used for the production of
secreted heterologous proteins in eukaryotic cells. In many cases, however, great difficulties are encountered
because considerable amounts of biologically inactive proteins are produced due to malfolding and aggregation
of the proteins, especially in the case of low molecular weight proteins.

Object of the Invention

Surprisingly, it has been found that a higher ratio of biologically active correctly folded heterologous protein
to inactive malfolded protein is produced in the host cell if the host cell has a "dibasic processing endoprotease"
activity in the endoplasmic reticulum (ER).

Thus, it is an object of the invention to provide a method for the preparation of heterologous biologically
active protein comprising the use of a host cell having a "dibasic processing endoprotease" in the ER. Other
objects are the provision of a host cell having a "dibasic processing endoprotease" variant which is located in
the ER due to the transformation with a gene encoding the "dibasic processing endoprotease" variant, further
the provision of a DNA molecule comprising such a gene, and the provision of methods for the preparation of
such a DNA molecule and of such a host cell.

Detailed description of the Invention

Process for the preparation of heterologous protein

The invention concerns a process for the preparation of heterologous biologically active protein liberated
in the host cell from a proprotein, said process comprising the use of a host cell having a "dibasic processing
endoprotease" activity in the ER.

A "dibasic processing endoprotease" activity within the meaning of the present invention is the activity of
an endoprotease specific for a motif of two basic amino acids, e.g. Arg-Arg, Arg-Lys, Lys-Arg or Lys-Lys, which
endoprotease is naturally located in the Golgi apparatus and is naturally involved in the processing of
proproteins or polyproteins.
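
The motif definition above can be illustrated with a short sequence scan. The following is a minimal sketch, assuming the protein is written in one-letter code (K for Lys, R for Arg); the function name is hypothetical, and a sequence match says nothing about whether a given enzyme actually cleaves at that position.

```python
import re

# A lookahead is used so that overlapping pairs (e.g. "KR" and "RR"
# within "KRR") are all reported.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def find_dibasic_sites(seq):
    """Return (1-based position of the first residue, pair) for each
    Arg-Arg, Arg-Lys, Lys-Arg or Lys-Lys motif in a one-letter sequence."""
    return [(m.start() + 1, m.group(1)) for m in DIBASIC.finditer(seq.upper())]


sites = find_dibasic_sites("AAKRGG")   # one Lys-Arg pair starting at residue 3
```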

The term "dibasic processing endoprotease" includes eukaryotic enzymes such as those of mammalian origin,
e.g. furin, PC2, PC1, PC3 (Barr, Cell 66: 1-3, 1991), and preferentially enzymes derived from yeast, such as
the YAP3 endoprotease [Egel-Mitani et al., Yeast 6:127-137 (1990)] and, most preferentially, the S. cerevisiae
endoprotease KEX2p.

The biologically active variants of the "dibasic processing endoprotease" of the invention are not restricted
to the Golgi apparatus but are located in the ER due to the presence of an ER retention signal, i.e. a structure
which is suitable for the retention of a "dibasic processing endoprotease" in the ER. The naturally occurring
"dibasic processing endoproteases" are attached to the membrane of the Golgi apparatus or secretory vesicles
due to a membrane anchor, i.e. a hydrophobic membrane-spanning sequence. The ER retention signals are
to be linked to the C-terminus of the protein, i.e. the "dibasic processing endoprotease", in order to locate the
protease in the ER. Such a fusion protein consisting of a protease and an ER retention signal is hereinafter called
"ER-located dibasic processing endoprotease".

In a preferred embodiment of the invention the ER retention signal is attached to the C-terminus of a soluble
form of a "dibasic processing endoprotease", i.e. a variant of a "dibasic processing endoprotease" which
is not attached to a cell membrane. Such a soluble form lacks the hydrophobic membrane-spanning sequence
but still retains the typical enzymatic "dibasic processing" function.

A preferred example of a soluble "dibasic processing endoprotease" useful in the present invention is a
soluble S. cerevisiae KEX2p, i.e. a KEX2p variant lacking the hydrophobic membrane-spanning sequence
located in the region Tyr679 to Met699 [the amino acid sequence of the 814-residue S. cerevisiae KEX2p is known
from K. Mizuno et al., Biochem. Biophys. Res. Commun. 156, 246-254 (1988)]. In particular, in a soluble KEX2p
endoprotease according to the invention, the membrane binding site has selectively been removed. Hence the
C-terminus starting with, for example, amino acid 700 (Lys) is still present, or the whole C-terminus including
the membrane binding site, i.e. 136 to approximately 200 amino acids from the C-terminus, has been removed. Such
soluble KEX2p proteins are described, for example, in EP 327,377 or in R.S. Fuller et al., Proc. Natl. Acad.
Sci. USA 86, 1434-1438 (1989). The most preferred soluble "dibasic processing endoprotease" of the invention
is the soluble KEX2p having the sequence depicted in the sequence listing under SEQ ID No. 1 and is
hereinafter referred to as KEX2ps.
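
The two ways of obtaining a soluble variant described above (selective removal of the membrane binding site, or removal of the whole C-terminus) can be sketched at the level of residue indices. This is an illustration only: the function names are hypothetical, the placeholder string merely has the stated length of 814 residues and is not the KEX2p sequence, and the region boundaries 679 to 699 used in the usage lines are an assumption about how the membrane-spanning region is numbered.

```python
def delete_region(seq, start, end):
    """Remove residues start..end (1-based, inclusive) from seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("region lies outside the sequence")
    return seq[:start - 1] + seq[end:]


def truncate_c_terminus(seq, n_removed):
    """Remove the last n_removed residues from seq."""
    if not 0 <= n_removed <= len(seq):
        raise ValueError("cannot remove more residues than are present")
    return seq[:len(seq) - n_removed]


# Placeholder with the right length only; NOT the real KEX2p sequence.
placeholder = "A" * 814

# Variant 1: only the membrane-spanning region is removed (assumed 679..699).
variant_internal = delete_region(placeholder, 679, 699)

# Variant 2: the whole C-terminus is removed (136 residues in this example).
variant_truncated = truncate_c_terminus(placeholder, 136)
```

Removing 136 residues from an 814-residue chain leaves residues 1 to 678, which is consistent with a membrane-spanning region that begins at residue 679.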

An ER-retention signal is a structure determining the location of a polypeptide in the ER. The location in
the ER may be based on a specific attachment to the ER membrane or preferentially on the prevention of the
transport of a soluble protein into the Golgi apparatus by retransportation of the polypeptide from a
compartment between the Golgi apparatus and the ER into the ER lumen. ER retention signals used preferentially in
the present invention are of the latter type, i.e. such preventing the transport of soluble protein to the Golgi
apparatus.

A preferred example of such an ER retention signal is the so-called KDEL sequence (SEQ ID No. 3)
functional in mammalian cells. More preferred is the DDEL sequence (SEQ ID No. 4) functional in the yeast
Kluyveromyces lactis and most preferred is the HDEL sequence (SEQ ID No. 2) functional in S. cerevisiae and in
K. lactis.
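
The pairing of host and retention signal can be written down as a small lookup. The sketch below is an assumption-laden illustration: the dictionary keys and the function name are hypothetical, one signal is chosen per host although HDEL is stated to be functional in both yeasts, and sequences are plain one-letter strings.

```python
# One retention signal per host, following the preferences stated in the text.
RETENTION_SIGNALS = {
    "mammalian": "KDEL",
    "K. lactis": "DDEL",       # HDEL is also stated to be functional here
    "S. cerevisiae": "HDEL",
}


def add_er_retention_signal(protease_seq, host):
    """Append the host's ER retention signal to the C-terminus."""
    try:
        signal = RETENTION_SIGNALS[host]
    except KeyError:
        raise ValueError(f"no retention signal listed for host {host!r}")
    return protease_seq + signal


fusion = add_er_retention_signal("MKV", "S. cerevisiae")   # toy sequence
```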

Preferred forms of the "ER-located dibasic processing endoprotease" comprise the ER-retention signal
KDEL sequence attached to a "dibasic processing endoprotease" of a mammalian cell such as, for example,
furin, PC1, PC2 or PC3 (P.J. Barr, supra), or preferably to a soluble variant thereof, or also to a S. cerevisiae
KEX2p, which latter is known to be functional in mammalian cells, or preferably to a soluble variant thereof. If
such an "ER-located dibasic processing endoprotease", e.g. a furinKDEL, PC1KDEL, PC2KDEL, PC3KDEL, or
KEX2pKDEL enzyme, is produced in a mammalian host cell transformed with a gene for the expression of
a heterologous protein, a higher proportion of correctly folded, secreted heterologous protein is produced.

More preferably, the DDEL retention signal is fused to a K. lactis KEX2p analog or preferably to a soluble
variant thereof, or to a S. cerevisiae KEX2p or preferably to a soluble variant thereof, in particular to KEX2ps.
S. cerevisiae KEX2p is functional also in K. lactis. Such a KEX2pDDEL produced in a K. lactis host cell allows
the expression of a higher proportion of correctly folded, secreted heterologous protein.

Most preferably, the HDEL retention signal is fused to a S. cerevisiae KEX2p, or preferably to a soluble
variant thereof, in particular to KEX2ps. Such a KEX2pHDEL protein produced in a K. lactis or, more preferably,
in a S. cerevisiae host cell allows the expression of a higher proportion of correctly folded, secreted
heterologous protein.

In order to produce a host cell in which an "ER-located dibasic processing endoprotease" is produced, the
host cell must be transformed with an expression cassette encoding an "ER-located dibasic processing
endoprotease". The host cell which is transformed may still contain an intact endogenous gene for the endogenous
dibasic processing endoprotease on the chromosome, i.e. in the case of the S. cerevisiae system the host
cell which is to be transformed with KEX2pHDEL may be a KEX2+ cell, e.g. strain AB110. However, the gene coding
for the corresponding endogenous dibasic processing endoprotease may also be destroyed, i.e. in the case
of the S. cerevisiae system the host cell may be a kex2- cell, e.g. strain AB110 kex2-.

For the transformation of a host cell, hybrid vectors are used which provide for replication and expression
of an expression cassette encoding the "ER-located dibasic processing endoprotease". These hybrid vectors
may be extrachromosomally maintained vectors or also vectors which are integrated into the host genome so
that a cell is produced which is stably transformed with said expression cassette. Suitable extrachromosomally
maintained vectors and also vectors integrating into the host genome for the transformation of mammalian
cells or of yeast cells are well known in the art.

The hybrid vectors may be derived from any vector useful in the art of genetic engineering, such as from
viruses, plasmids or chromosomal DNA, such as derivatives of SV40, Herpes-viruses, Papilloma viruses,
Retroviruses, Baculovirus, or derivatives of yeast plasmids, e.g. yeast 2μ plasmid.

Several possible vector systems are available for integration and expression of the cloned DNA of the
invention. In principle, all vectors which replicate and/or express a desired polypeptide gene comprised in an
expression cassette of the invention in the chosen host are suitable. The vector is selected depending on the
host cells envisaged for transformation. Such host cells are preferably mammalian cells (if a "dibasic processing
endoprotease" functional in mammalian cells is used) or, more preferably, yeast cells (if a "dibasic processing
endoprotease" functional in yeast cells is used). In principle, the extrachromosomally maintained hybrid vectors
of the invention comprise the expression cassette for the expression of an ER-located "dibasic processing
endoprotease", and an origin of replication or an autonomously replicating sequence.

An origin of replication or an autonomously replicating sequence (a DNA element which confers autonomously
replicating capabilities to extrachromosomal elements) is provided either by construction of the vector
to include an exogenous origin such as, in the case of the mammalian vector, derived from Simian virus (SV40)
or another viral source, or by the host cell chromosomal mechanisms.

A hybrid vector of the invention may contain selective markers depending on the host which is to be
transformed, selected and cloned. Any marker gene can be used which facilitates the selection of transformants
due to the phenotypic expression of the marker. Suitable markers are particularly those expressing antibiotic
resistance, e.g. against tetracycline or ampicillin, or genes which complement host lesions. It is also possible
to employ as markers structural genes which are associated with an autonomously replicating segment
providing that the host to be transformed is auxotrophic for the product expressed by the marker.

Preferred vectors suitable for the preparation of hybrid vectors of the invention, i.e. comprising an
expression cassette for the preparation of an ER-located "dibasic processing endoprotease", are those which are
suitable for replication and expression in S. cerevisiae and contain a yeast replication origin and a selective genetic
marker for yeast. Hybrid vectors that contain a yeast replication origin, for example the chromosomal
autonomously replicating segment (ARS), are retained extrachromosomally within the yeast cell after transformation
and are replicated autonomously during mitosis. Also, hybrid vectors that contain sequences homologous to
the yeast 2μ plasmid DNA or that contain ARS and a sequence of a chromosomal centromere, for example
CEN4, can be used. Preferred are the 2μ based plasmids containing the complete or partial S. cerevisiae
2μ plasmid sequence. Suitable marker genes for yeast are especially those that impart antibiotic resistance to
the host or, in the case of auxotrophic yeast mutants, genes that complement the host lesions. Corresponding
genes impart, for example, resistance to the antibiotic cycloheximide or provide for prototrophy in an
auxotrophic yeast mutant, for example the URA3, LEU2, HIS3 or the TRP1 gene.

Preferably, hybrid vectors furthermore contain a replication origin and a marker gene for a bacterial host,
especially E. coli, so that the construction and the cloning of the hybrid vectors and their precursors can be
carried out in E. coli.

In a most preferred embodiment of the invention a kex2- strain of S. cerevisiae is transformed either with
an extrachromosomally maintained plasmid or with an integration plasmid comprising an expression cassette for
the expression of a soluble KEX2pHDEL.

An "expression cassette" for the expression of an ER-located "dibasic processing endoprotease" means
a DNA sequence capable of expressing such a polypeptide and comprises a promoter and a structural gene
and, if desired, a transcriptional terminator and optionally a transcriptional enhancer, ribosomal binding site
and/or further regulatory sequences.

Such an expression cassette may contain either the regulatory elements naturally linked with the
corresponding "dibasic processing endoprotease" gene, heterologous regulatory elements or a mixture of both, i.e.,
for example, a homologous promoter and a heterologous terminator region.

A wide variety of promoter sequences may be employed, depending on the nature of the host cell.
Sequences for the initiation of translation are for example Shine-Dalgarno sequences. Sequences necessary for
the initiation and termination of transcription and for stabilizing the mRNA are commonly available from the
noncoding 5'-regions and 3'-regions, respectively, of viral or eukaryotic cDNAs, e.g. from the expression host.

Examples for promoters are as above, i.e. yeast TRP1-, ADHI-, ADHII-, CYC1, GAL1/10, CUP1, PHO3-,
or PHO5-promoter, or promoters from heat shock proteins, or glycolytic promoters such as
glyceraldehyde-3-phosphate dehydrogenase (GAP) promoter (including 5' truncated GAP) or a promoter of the enolase,
3-phosphoglycerate kinase (PGK), hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate
isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose
isomerase and glucokinase genes, furthermore α-factor promoter and hybrid promoters, such as hybrid
PHO5-GAP or ADH2-GAP promoters or hybrid promoters using heat shock elements.

Promoters suitable for the expression in mammalian cells are, for example, derived from viruses, e.g.
SV40, Rous sarcoma virus, adenovirus 2, bovine papilloma virus, papovavirus, cytomegalovirus derived
promoters, or are mammalian cell derived promoters, e.g. of the actin, collagen, myosin, or β-globin gene. The
yeast promoters may be combined with enhancing sequences such as the yeast upstream activating sequences
(UAS) and the promoters active in mammalian cells may be combined with viral or cellular enhancers such
as the cytomegalovirus IE enhancers, SV40 enhancer, immunoglobulin gene enhancer or others.

Enhancers are transcription-stimulating DNA sequences, e.g. derived from viruses such as Simian virus,
polyoma virus, bovine papilloma virus or Moloney sarcoma virus, or of genomic origin. An enhancer sequence
may also be derived from the extrachromosomal ribosomal DNA of Physarum polycephalum, or it may be the
upstream activation site from the yeast acid phosphatase PHO5 gene, or the yeast PHO5, TRP, PHO5-GAPDH
hybrid, or the like promoter.

A host cell of the invention having a "dibasic processing endoprotease" activity in the ER is useful for the
preparation of correctly processed heterologous proteins. For this purpose an expression cassette for the
expression of a gene encoding the desired heterologous protein is of course also to be introduced into the host
cell. Such an expression cassette is herein named "production gene".

Such a production gene comprises a promoter region, a DNA sequence encoding a signal peptide which can
be cleaved off by a signal peptidase, a DNA sequence encoding a pro-sequence which can be cleaved off from
the desired heterologous gene product by a "dibasic processing endoprotease", a DNA sequence encoding a
desired heterologous gene product and/or a transcriptional terminator region and optionally a transcriptional
enhancer, ribosomal binding site and/or further regulatory sequences. The coding regions for the signal peptide,
the pro-sequence and the heterologous protein are attached "in frame", i.e. after translation of the structural
gene the signal peptide is covalently linked to the N-terminus of the pro-sequence, and the pro-sequence is
in turn covalently linked to the N-terminus of the heterologous protein.
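
The "in frame" requirement can be checked mechanically when the three coding regions are joined. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the segments are plain DNA strings, the toy sequences in the usage line are invented for illustration, and the standard stop codons TAA, TAG and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def assemble_in_frame(signal, pro, product):
    """Join signal, pro-sequence and product coding regions in frame.

    Each segment must be a whole number of codons, and the two upstream
    segments must not contain an in-frame stop codon, because that would
    end translation before the product is reached.
    """
    for name, segment in (("signal", signal), ("pro", pro), ("product", product)):
        if len(segment) % 3 != 0:
            raise ValueError(f"{name} segment length is not a multiple of 3")
    for name, segment in (("signal", signal), ("pro", pro)):
        if STOP_CODONS & set(codons(segment.upper())):
            raise ValueError(f"{name} segment contains an in-frame stop codon")
    return (signal + pro + product).upper()


# Toy segments: the pro segment ends with AAG AGA, i.e. a Lys-Arg pair.
orf = assemble_in_frame("ATGAAA", "GCTAAGAGA", "GGTTAA")
```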

The pro-sequence can be any sequence from a random genomic library of fragments which can act as a
molecular chaperone, i.e. a polypeptide which in cis or in trans can influence the formation of an appropriate
conformation. Preferably, it is a random sequence which allows membrane translocation. Particularly
preferred is the α-factor pro-sequence.

As in the expression cassette described above for the expression of a "dibasic processing endoprotease",
a wide variety of regulatory sequences may be employed, depending on the nature of the host cell. For example,
promoters that are strong and at the same time well regulated are the most useful. Sequences for the initiation
of translation are for example Shine-Dalgarno sequences. Sequences necessary for the initiation and
termination of transcription and for stabilizing the mRNA are commonly available from the noncoding 5'-regions and
3'-regions, respectively, of viral or eukaryotic cDNAs, e.g. from the expression host.

Signal peptides within the meaning of the present invention are presequences directing the translocation
of the desired polypeptide to the ER, for example the α-factor signal sequence. Further signal sequences are
known from literature, e.g. those compiled in von Heijne, G., Nucleic Acids Res. 14, 4683 (1986).

Examples for suitable promoters are as above, i.e. yeast TRP1-, ADHI-, ADHII-, CYC1, GAL1/10, CUP1,
PHO3-, or PHO5-promoter, or promoters from heat shock proteins, or glycolytic promoters such as
glyceraldehyde-3-phosphate dehydrogenase (GAP) promoter (including 5' truncated GAP) or a promoter of the enolase,
3-phosphoglycerate kinase (PGK), hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase,
phosphoglucose isomerase and glucokinase genes, furthermore α-factor promoter and hybrid promoters, such as hybrid
PHO5-GAP or ADH2-GAP promoters or hybrid promoters using heat shock elements, or promoters derived from
eukaryotic viruses, e.g. SV40, Rous sarcoma virus, adenovirus 2, bovine papilloma virus, papovavirus,
cytomegalovirus derived promoters or mammalian cell derived promoters, e.g. of the actin, collagen, myosin, or
β-globin gene. The eukaryotic promoters may be combined with enhancing sequences such as the yeast
upstream activating sequences (UAS) or viral or cellular enhancers such as the cytomegalovirus IE enhancers,
SV40 enhancer, immunoglobulin gene enhancer or others.

The expression cassette encoding the ER-located "dibasic processing endoprotease" and the production
gene may comprise promoters of the same or of different types. For example, they may both be regulated by
an inducible promoter which allows the concerted expression of the precursor of the heterologous protein and
of the ER-located "dibasic processing endoprotease" processing it.

In a preferred embodiment of the invention a production gene suitable for the expression in a S. cerevisiae
cell which cell contains an ER-located "dibasic processing endoprotease", preferably YAP3pHDEL, more
preferably KEX2pHDEL or, most preferably, KEX2psHDEL, comprises a structural fusion gene composed of a DNA
sequence encoding a yeast pro-sequence which can be cleaved off from the precursor by a yeast "dibasic
processing endoprotease", preferably the S. cerevisiae α-factor leader sequence, and downstream a DNA sequence
coding for a desired heterologous protein, said fusion gene being under the control of expression control
sequences regulating transcription and translation in yeast.

The heterologous protein may be any protein of biological interest and of prokaryotic or especially
eukaryotic, in particular higher eukaryotic such as mammalian (including animal and human), origin and is, for
example, an enzyme which can be used, for example, for the production of nutrients and for performing enzymatic
reactions in chemistry or molecular biology, or a protein which is useful and valuable for the treatment of human
and animal diseases or for the prevention thereof, for example a hormone, a polypeptide with
immunomodulatory, anti-viral and anti-tumor properties, an antibody, viral antigen, blood clotting factor, a fibrinolytic agent,
a growth regulation factor, furthermore a foodstuff and the like.

Examples of such proteins are e.g. hormones such as secretin, thymosin, relaxin, calcitonin, luteinizing
hormone, parathyroid hormone, adrenocorticotropin, melanocyte-stimulating hormone, β-lipotropin, urogastrone,
insulin, growth factors, such as epidermal growth factor (EGF), insulin-like growth factor (IGF), e.g. IGF-1 and
IGF-2, mast cell growth factor, nerve growth factor, glia derived nerve cell growth factor, platelet derived growth
factor (PDGF), or transforming growth factor (TGF), such as TGFβ, growth hormones, such as human or bovine
growth hormones, interleukin, such as interleukin-1 or -2, human macrophage migration inhibitory factor (MIF),
interferons, such as human α-interferon, for example interferon-αA, αB, αD or αF, β-interferon, γ-interferon
or a hybrid interferon, for example an αA-αD- or an αB-αD-hybrid interferon, especially the hybrid interferon
BDBB, proteinase inhibitors such as α1-antitrypsin, SLPI and the like, hepatitis virus antigens, such as
hepatitis B virus surface or core antigen or hepatitis A virus antigen, or hepatitis nonA-nonB antigen, plasminogen
activators, such as tissue plasminogen activator or urokinase, hybrid plasminogen activators, such as K2tuPA,
tick anticoagulant peptide (TAP), tumour necrosis factor, somatostatin, renin, immunoglobulins, such as the
light and/or heavy chains of immunoglobulin D, E or G, or human-mouse hybrid immunoglobulins,
immunoglobulin binding factors, such as immunoglobulin E binding factor, human calcitonin-related peptide, blood
clotting factors, such as factor IX or VIIIc, platelet factor 4, erythropoietin, eglin, such as eglin C, desulfatohirudin,
such as desulfatohirudin variant HV1, HV2 or PA, corticostatin, echistatin, cystatins, human superoxide
dismutase, viral thymidine kinase, β-lactamase or glucose isomerase. Preferred are human α-interferon, e.g.
interferon αB, or hybrid interferon, particularly hybrid interferon BDBB (see EP 205,404), human tissue
plasminogen activator (t-PA), human single chain urokinase-type plasminogen activator (scu-PA), hybrid plasminogen
activator K2tuPA (see EP 277,313), human calcitonin, desulfatohirudin, e.g. variant HV1, even more preferred
insulin-related proteins such as insulin, relaxin, the even more preferred insulin-like growth factor II and,
in particular, insulin-like growth factor I. Proteins containing a pair of basic amino acids, such as Arg-Arg,
Lys-Arg, Lys-Lys and Arg-Lys, exposed on the protein surface and therefore amenable to proteolytic cleavage,
are not suited for the process according to the invention and will have to be mutated such that one of the
consecutive basic amino acids is replaced by another non-basic amino acid without affecting the biological activity.
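
The mutation rule stated at the end of the preceding paragraph can be sketched on a one-letter sequence. This is a hypothetical illustration, not a design tool: the function name and the default substitute residue (Gln, "Q") are assumptions, and the sketch neither knows which pairs are exposed on the protein surface nor whether a substitution preserves biological activity, both of which the text requires.

```python
BASIC = {"K", "R"}   # Lys and Arg in one-letter code


def break_dibasic_pairs(seq, substitute="Q"):
    """Replace the second residue of every dibasic pair.

    Returns the mutated sequence and a list of
    (1-based position, original residue, new residue) tuples.
    """
    if substitute in BASIC:
        raise ValueError("the substitute residue must be non-basic")
    residues = list(seq.upper())
    changes = []
    for i in range(len(residues) - 1):
        if residues[i] in BASIC and residues[i + 1] in BASIC:
            changes.append((i + 2, residues[i + 1], substitute))
            residues[i + 1] = substitute
    return "".join(residues), changes


mutated, changes = break_dibasic_pairs("AKRA")
```

Scanning from left to right and mutating the second residue means that a run such as Lys-Lys-Lys-Lys needs only two substitutions before no adjacent basic pair remains.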

A production gene need not necessarily be located on the same vector molecule as the gene encoding
the ER-located "dibasic processing endoprotease". In case the latter is located on a vector which is
extrachromosomally maintained, it may be advantageous if the production gene is located on the same vector
molecule.

Expression vectors suitable for the expression of a production gene are, for example, also those which
are described above as being suitable for the expression of an ER-located "dibasic processing endoprotease",
i.e. vectors derived from any vector useful in the art of genetic engineering, such as from viruses, plasmids or
chromosomal DNA, such as derivatives of SV40, Herpes-viruses, Papilloma viruses, Retroviruses,
Baculovirus, or derivatives of yeast plasmids, e.g. yeast 2μ plasmid. Preferred are vectors for replication and
expression in S. cerevisiae.

Preferably, the hybrid vectors of the present invention also contain a replication origin and a marker gene
for a bacterial host, especially E. coli, so that the construction and the cloning of the hybrid vectors and their
precursors can be carried out in E. coli.

A process for the preparation of heterologous biologically active protein comprising the use of a host cell
having a "dibasic processing endoprotease" activity in the ER according to the invention comprises (a)
transforming a suitable host cell with a hybrid vector comprising an expression cassette encoding an ER-located
"dibasic processing endoprotease" and with a hybrid vector encoding a production gene, or (b) transforming
a suitable host cell with a hybrid vector comprising both an expression cassette encoding an ER-located
"dibasic processing endoprotease" and a production gene, or (c) transforming a suitable host cell which is stably
transformed with a gene encoding an ER-located "dibasic processing endoprotease" with a hybrid vector
encoding a production gene, culturing the transformed host cells under conditions in which the gene encoding
the ER-located "dibasic processing endoprotease" and the production gene are expressed, and isolating the
desired heterologous polypeptide from the culture medium according to conventional methods.

The invention preferentially concerns a process wherein a yeast strain, more preferably a Saccharomyces
cerevisiae strain, e.g. AB110 or AB110 kex2-, and an ER-located yeast "dibasic processing endoprotease", e.g.
YAP3DDEL or, preferably, YAP3HDEL or, more preferably, KEX2pHDEL, most preferably KEX2psHDEL, are
used for the preparation of an insulin-like protein, preferably IGF-2 and, more preferably, IGF-1, which is
produced as a precursor containing the α-factor leader sequence.

The transformation is accomplished by methods known in the art, for example, according to the method
described by Hinnen et al. [Proc. Natl. Acad. Sci. USA 75, 1929 (1978)]. This method can be divided into three
steps:

(1) Removal of the yeast cell wall or parts thereof.

(2) Treatment of the "naked" yeast cells (spheroplasts) with the expression vector in the presence of PEG
(polyethyleneglycol) and Ca2+ ions.

(3) Regeneration of the cell wall and selection of the transformed cells in a solid layer of agar.

The transformed host cells are cultured by methods known in the art in a liquid medium containing
assimilable sources of carbon, nitrogen and inorganic salts. Various sources of carbon can be used for culture of the
transformed yeast cells according to the invention. Examples of preferred sources of carbon are assimilable
carbohydrates, such as glucose, maltose, mannitol or lactose, or an acetate, which can be used either by itself
or in suitable mixtures. Examples of suitable sources of nitrogen are amino acids, such as casaminoacids,
peptides and proteins and their degradation products, such as tryptone, peptone or meat extracts, yeast extracts,
malt extract and also ammonium salts, for example ammonium chloride, sulfate or nitrate, which can be used
either by themselves or in suitable mixtures. Inorganic salts which can also be used are, for example, sulfates,
chlorides, phosphates and carbonates of sodium, potassium, magnesium and calcium. The medium
furthermore contains, for example, growth-promoting substances, such as trace elements, for example iron, zinc,
manganese and the like, and preferably substances which exert a selection pressure and prevent the growth
of cells which have lost the expression plasmid. Thus, for example, if a yeast strain which is auxotrophic in,
for example, an essential amino acid, is used as the host microorganism, the plasmid preferably contains a
gene coding for an enzyme which complements the host defect. Cultivation of the yeast strain is performed
in a minimal medium deficient in said amino acid.

Culturing is effected by processes which are known in the art. The culture conditions, such as temperature,
pH value of the medium and fermentation time, are chosen such that a maximum titre of the heterologous
proteins prepared according to the invention is obtained. Thus, the yeast strain is preferably cultured under aerobic
conditions by submerged culture with shaking or stirring at a temperature of about 20 to 40°C, preferably about
30°C, and a pH value of 5 to 8, preferably at about pH 7, for about 4 to about 96 hours, preferably until maximum
yields of the proteins of the invention are reached. The culture medium is selected in such a way that selection
pressure is exerted and only those cells survive which still contain the hybrid vector DNA including the genetic
marker. Thus, for example, an antibiotic is added to the medium when the hybrid vector includes the
corresponding antibiotic resistance gene.

When the cell density has reached a sufficient value, culturing is interrupted and the medium containing
the product is separated from the cells, which can be provided with fresh medium and used for continuous
production. The protein can also accumulate within the cells, especially in the periplasmic space. In the latter case
the first step for the recovery of the desired protein consists in liberating the protein from the cell interior. The
cell wall is first removed by enzymatic digestion with glucosidases or, alternatively, the cell wall is removed
by treatment with chemical agents, i.e. thiol reagents or EDTA, which give rise to cell wall damage permitting
the produced protein to be released. The resulting mixture is enriched for heterologous protein by conventional
means, such as removal of most of the non-proteinaceous material by treatment with polyethyleneimine,
precipitation of the proteins using ammonium sulphate, gel electrophoresis, dialysis, chromatography, for
example, ion exchange chromatography (especially preferred when the heterologous protein includes a large
number of acidic or basic amino acids), size-exclusion chromatography, HPLC or reverse phase HPLC, molecular
sizing on a suitable Sephadex® column, or the like. The final purification of the pre-purified product is
achieved, for example, by means of affinity chromatography, for example antibody affinity chromatography,
especially monoclonal antibody affinity chromatography using antibodies fixed on an insoluble matrix by methods
known in the art.

Recombinant DNA molecules 

The invention also concerns a recombinant DNA molecule encoding an expression cassette for an
ER-located "dibasic processing endoprotease" as defined above. The invention further concerns hybrid vectors
comprising such a recombinant DNA molecule.

The present invention preferably concerns a recombinant DNA molecule or hybrid vector comprising the
coding region for KEX2p, preferentially for a soluble KEX2p variant, most preferably for KEX2ps shown in the
sequence listing under SEQ ID No. 1, and for an ER retention signal, preferentially for the HDEL sequence
shown in the sequence listing under SEQ ID No. 2. The coding sequence for the ER retention signal is
preferentially located downstream of the KEX2p coding region. A KEX2p with HDEL attached at the
C-terminus is herein named KEX2pHDEL; the corresponding structural gene is KEX2HDEL.

As mentioned above, some soluble KEX2p variants are known from the literature. Further deletion mutants
according to the invention can be prepared using methods known in the art, for example by preparing a
corresponding DNA coding for said mutant, inserting it in a suitable vector DNA under the control of an expression
control sequence, transforming suitable host microorganisms with the expression vector formed, culturing the
transformed host microorganism in a suitable culture medium and isolating the produced mutant. The DNA
coding for any of said mutants can be produced, for example, by taking a plasmid containing the DNA coding for
KEX2p and (1) digesting it with a restriction enzyme which cleaves within or 3' of the DNA region coding for
the membrane binding site (for example, EcoRI, BstXI or NarI), digesting the cleaved DNA with a suitable
endonuclease, for example Bal31, such that said DNA region is removed, and recircularizing the linearized plasmid
by blunt end ligation or the like, or (2) choosing or creating (for example by site-directed mutagenesis) one
restriction site 5' to and one restriction site 3' to the DNA region coding for the membrane binding site (for
example PvuII and NarI or EcoRI; the 3' restriction site may also be located within the plasmid DNA adjacent to
the translation stop signal of the KEX2 gene), digesting the plasmid with two restriction enzymes recognizing
said restriction sites and recircularizing the linearized plasmid by blunt end ligation or the like, or (3) deleting
the DNA region coding for the membrane binding site by using loop-out mutagenesis, or (4) totally deleting
the C-terminus by digesting with PvuII in the case of KEX2 and recircularizing the linearized plasmid by blunt
end ligation or the like. As the DNA sequences of KEX2 are known (K. Mizuno et al., supra), a suitable mutagenic
oligonucleotide can easily be devised and used to delete said DNA region applying the M13 cloning system.
Care must be taken that the mutated KEX2 genes are linked with a DNA sequence encoding a yeast ER
retention signal. Such a DNA sequence can be introduced at the desired place via a synthetic linker DNA or it
may be provided by the adjacent vector DNA. Preferentially, the mutated KEX2 genes include at their 3' ends
codons which code for the HDEL sequence defined above. All of these methods make use of conventional
techniques.

Within the scope of the present invention are also recombinant DNA molecules comprising DNA sequences
which are degenerate within the meaning of the genetic code to the DNA sequences with SEQ ID No. 1 and
2, i.e. DNA sequences encoding the same amino acid sequences although nucleotides are exchanged. Such
degenerate DNA sequences may, for example, contain new restriction enzyme cleavage sites.
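
As an illustration of degeneracy, the following minimal Python sketch translates two different nucleotide sequences that both encode the HDEL retention signal followed by a stop codon. The two example sequences are hypothetical and are not the sequences of SEQ ID No. 2, 11 or 12; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in T-C-A-G order (a common compact construction).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def n_differences(a: str, b: str) -> int:
    """Count positions at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Two hypothetical coding sequences for His-Asp-Glu-Leu plus a stop codon.
hdel_a = "CATGATGAATTGTAA"
hdel_b = "CACGACGAGCTTTAG"

print(translate(hdel_a), translate(hdel_b), n_differences(hdel_a, hdel_b))
```

Both sequences translate to the same peptide although they differ at several nucleotide positions, which is the sense in which the text uses "degenerate".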

Host strains 

Another aspect of the present invention involves host cells, preferably mammalian, more preferably yeast,
even more preferably K. lactis and most preferably S. cerevisiae cells transformed with a hybrid vector of the
invention comprising an expression cassette encoding an ER-located "dibasic processing endoprotease". The
invention also concerns host cells which are stably transformed with an expression cassette encoding an
ER-located "dibasic processing endoprotease", i.e. which comprise such a recombinant expression cassette
integrated into a chromosome.

Suitable hosts for the integration of an expression cassette encoding KEX2HDEL are e.g. kex2⁻ mutants
of yeast, preferentially of S. cerevisiae. The method for the preparation of transformed host cells comprises
transforming host cells with an integration vector consisting of a KEX2pHDEL expression cassette which is
under the control of any constitutive or inducible promoter, preferably of the promoters defined above or of
the promoter of the KEX2 gene, and selecting stably transformed cells. Stable integrative transformation is
state of the art and can be performed, for example, according to the procedures reported for mammalian cells
in P.L. Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987) or in F.L. Graham et al.,
Virology 52:456-467 (1973) and for S. cerevisiae cells in R. Rothstein, Methods Enzymol. 194:281-302 (1991).

The invention concerns especially the recombinant DNA molecules, the hybrid vectors, the transformed
hosts, the proteins and the methods for the preparation thereof and the method for the preparation of a
biologically active protein as described in the examples.

The following examples serve to illustrate the invention but should not be construed as a limitation thereof.

Example 1: Construction of a shortened KEX2 gene encoding soluble KEX2p variant

In order to get a soluble KEX2p protease activity, a mutant KEX2 gene lacking 600 bp, coding for the
C-terminal 200 amino acids, is constructed. The truncated gene is under the control of the KEX2 promoter
reaching from -1 to -502. Translation is terminated at a stop codon (TAA) originating from the polylinker of pUC18.

In detail, plasmid pUC19 [Boehringer Mannheim GmbH, FRG] is digested to completion with HindIII and
the 2686 bp fragment is isolated. The ends are filled in and the fragment is religated. An aliquot of the ligation
mixture is added to calcium-treated, transformation-competent E. coli JM101 [Invitrogen, San Diego, USA] cells.
Twelve ampicillin-resistant E. coli transformants are grown in the presence of 100 µg/ml ampicillin.
Plasmid DNA is prepared and analysed by digestion with HindIII as well as with BamHI. The plasmid lacking the
HindIII site is designated pUC19woH.
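
The removal of the HindIII site by fill-in and religation can be sketched as follows. This is a minimal Python illustration under stated assumptions: HindIII recognizes AAGCTT and cuts after the first A (A^AGCTT), leaving a 4-nucleotide 5' overhang; the short flanking sequence is hypothetical, and a linear string stands in for the circular plasmid. Under these assumptions the religated plasmid gains 4 bp, which is consistent with the 2686 bp pUC19 fragment above and the 2690 bp linear fragment isolated from pUC19woH in the next step.

```python
HINDIII_SITE = "AAGCTT"
HINDIII_CUT = 1  # top-strand cut position within the site: A^AGCTT


def fill_in_and_religate(seq: str, site: str = HINDIII_SITE, cut: int = HINDIII_CUT) -> str:
    """Cut at a single 5'-overhang site, fill in both ends, then blunt-end ligate.

    Filling in duplicates the overhang bases on the top strand, so the
    product is longer than the input by the overhang length.
    """
    if seq.count(site) != 1:
        raise ValueError("expected exactly one recognition site")
    start = seq.index(site)
    overhang_len = len(site) - 2 * cut           # 4 nt for HindIII
    left = seq[: start + cut]                    # top strand up to the cut
    overhang = seq[start + cut : start + cut + overhang_len]
    right = seq[start + cut :]                   # begins with the overhang bases
    return left + overhang + right


toy_plasmid = "GGATCCAAGCTTGCATGC"               # hypothetical flanks around one HindIII site
religated = fill_in_and_religate(toy_plasmid)

print(religated)                                 # junction now reads AAGCTAGCTT
print(len(religated) - len(toy_plasmid))         # base pairs gained
print(HINDIII_SITE in religated)                 # is the site still present?
```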

A 3207 bp BalI-AhaIII KEX2 fragment (obtainable from total genomic yeast DNA) is provided at both ends
with BamHI linkers followed by a complete digestion with BamHI. Plasmid pUC19woH is cut to completion with
BamHI, the linear 2690 bp fragment is isolated and ligated to the BamHI KEX2 fragment described above. An
aliquot of the ligation mixture is transformed into E. coli JM101 cells. Twelve ampicillin-resistant
colonies are grown in LB medium containing ampicillin (100 µg/ml), plasmid DNA is extracted and analyzed by
BamHI digests. One clone with the expected restriction fragments is selected and called pKS301b (deposited
as DSM 6028).

The 2 µm yeast vector pAB24, which corresponds essentially to plasmid pDP34 (deposited as DSM 4473),
is cut to completion with BamHI and the linear pAB24 fragment is isolated. Plasmid pKS301b is digested with
BamHI and the fragment containing the complete KEX2 gene is isolated and ligated to the linearized yeast
vector pAB24. An aliquot of the ligation mixture is transformed into E. coli JM101 and plasmid DNA of twelve
positive clones is examined by BamHI digests. One clone with the expected restriction fragments is referred
to as pAB226.

Plasmid pKS301b is digested to completion with SphI, PvuII and ScaI. The 2.37 kb SphI-PvuII fragment
containing KEX2 sequences from -502 to +1843 and a part of the pUC19 polylinker is isolated. Plasmid pUC18
[Boehringer Mannheim, FRG] is cut to completion with SphI and SmaI. The 2660 bp SphI-SmaI pUC18 fragment
is ligated to the 2.37 kb SphI-PvuII KEX2 fragment by SphI/SphI and PvuII/SmaI ligation. The PvuII/SmaI
ligation results in the fusion of the KEX2 ORF coding for 614 amino acids to an ORF in the pUC18 sequences
which codes for 7 additional C-terminal amino acids (-G-V-P-S-S-N-S) and is followed by a stop codon (TAA).
An aliquot of the ligation mixture is transformed into E. coli JM101. Plasmid DNA is isolated from ampicillin-resistant
E. coli transformants and analyzed by digestion with SphI and EcoRI as well as with HindIII. One clone
with the expected restriction pattern is referred to as p18kexp. In the sequence listing under SEQ ID No. 1
the ORF encoding the soluble KEX2ps with KEX2-derived DNA is shown.
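
The C-terminal extension created by the PvuII/SmaI fusion can be checked with a short translation. The sketch below is illustrative only. The junction sequence is an assumption reconstructed from the commonly published pUC18 polylinker (SmaI, Asp718/KpnI, SacI, EcoRI sites) and from the statement that the KEX2 sequences end at +1843, so that one KEX2 nucleotide (taken here to be the G of the PvuII half-site CAG) starts codon 615. It is not copied from SEQ ID No. 1 and should be compared against the sequence listing.

```python
from itertools import product

# Standard genetic code in T-C-A-G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_to_stop(dna: str) -> str:
    """Translate from the first base up to and including the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


KEX2_CODONS = 614
KEX2_LAST_POSITION = 1843                 # from the text; 614 * 3 + 1
KEX2_LAST_BASE = "G"                      # assumed: last base of the PvuII half-site CAG
PUC18_AFTER_SMAI = "GGGTACCGAGCTCGAATTCGTAATCATGG"  # assumed pUC18 sequence 3' of the SmaI cut

junction = KEX2_LAST_BASE + PUC18_AFTER_SMAI
tail = translate_to_stop(junction)

print(tail)                               # C-terminal extension plus stop
print(KEX2_CODONS + len(tail.rstrip("*")))  # total length of the fusion protein
```

Under these assumptions the junction also contains the Asp718 (GGTACC) and EcoRI (GAATTC) sites that are used in example 2.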

Plasmid p18kexp is cut to completion with PvuII, SalI and ScaI. The 2552 bp SalI-PvuII fragment containing
the KEX2 sequences reaching from -502 to +1843 as well as 206 bp of pUC18 sequences is isolated. Plasmid
pDP34 is digested with BamHI and the ends of the linearized plasmid are filled in. After inactivation of T4
polymerase the linearized filled-in plasmid is cut with SalI and the 11.78 kb fragment is isolated. The pDP34
BamHI*-SalI fragment (BamHI*: filled-in BamHI) is ligated to the 2552 bp SalI-PvuII fragment by SalI/SalI and
BamHI*/PvuII ligation. An aliquot of the ligation mixture is transformed into transformation-competent E. coli
JM101 cells. Plasmid DNA is extracted from ampicillin-resistant cells and analyzed by restriction analysis with SalI,
NcoI, SmaI, XbaI and EcoRI. One clone with the expected restriction fragments is referred to as pDPkexp.
Example 2: Construction of pDPkexpHDEL

Plasmid p18kexp (see example 1) consists of the truncated KEX2 gene coding for soluble KEX2p (KEX2ps)
inserted into the polylinker region of pUC18. The DNA sequence coding for the C-terminal end of KEX2ps in
p18kexp is followed by an Asp718 and an EcoRI site (see SEQ ID No. 1). The plasmid is cut with Asp718 and
EcoRI and is ligated with the hybridized oligonucleotides with SEQ ID No. 11 and 12, encoding the HDEL
sequence and two stop codons, resulting in the ligation product p18kexpHDEL. Plasmids p18kexp and
p18kexpHDEL can be distinguished by SacI or SfuI digestion. The polylinker insertion region was sequenced in
p18kexpHDEL.

Plasmid p18kexpHDEL was cut with SalI, PvuII and ScaI and the 2572 bp SalI-PvuII fragment was isolated.

Plasmid pDP34 was cut with BamHI and the sticky ends were filled in with Klenow polymerase. After filling
in, the polymerase was destroyed by phenol/chloroform and chloroform extractions followed by an ethanol
precipitation. The BamHI-cut, filled-in pDP34 fragment was then digested with SalI and the 11780 bp
SalI-BamHI* fragment (BamHI*: filled-in BamHI site) was isolated.

The 2572 bp SalI-PvuII fragment isolated from p18kexpHDEL was ligated with the 11780 bp SalI-BamHI*
pDP34 fragment. Ligation of SalI/SalI and PvuII/BamHI* led to the plasmid pDPkexpHDEL.
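
The expected sizes of the two yeast plasmids follow from the fragment lengths quoted in examples 1 and 2. The Python sketch below is simple bookkeeping under the assumption that the quoted fragment sizes can be added directly; the function and variable names are hypothetical, and counting conventions for overhang bases may shift the totals by a few base pairs.

```python
VECTOR_BP = 11780            # SalI-BamHI* fragment of pDP34
INSERT_KEXP_BP = 2552        # SalI-PvuII fragment of p18kexp (example 1)
INSERT_KEXP_HDEL_BP = 2572   # SalI-PvuII fragment of p18kexpHDEL (example 2)


def ligated_size(vector_bp: int, insert_bp: int) -> int:
    """Approximate size of the circular product of a two-fragment ligation."""
    return vector_bp + insert_bp


pdpkexp_bp = ligated_size(VECTOR_BP, INSERT_KEXP_BP)
pdpkexphdel_bp = ligated_size(VECTOR_BP, INSERT_KEXP_HDEL_BP)

print(pdpkexp_bp, pdpkexphdel_bp, pdpkexphdel_bp - pdpkexp_bp)
```

The 20 bp difference between the two inserts reflects the net effect of replacing the Asp718-EcoRI segment with the oligonucleotides encoding HDEL and two stop codons.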

Example 3: Construction of a yeast vector containing the IGF-1 expression cassette

Plasmid pDP34 is an E. coli - S. cerevisiae shuttle vector containing the complete 2µ sequence, the yeast
genomic URA3 and dLEU2 sequences as selectable markers for yeast, and pBR322 sequences for selection
and propagation in E. coli [A. Hinnen, B. Meyhack and J. Heim, in Yeast genetic engineering (P.J. Barr, A.J.
Brake & P. Valenzuela, eds.), pp. 193-213 (1989), Butterworth Publishers, Stoneham]. A 276 bp SalI-BamHI
fragment of pBR322 [Boehringer Mannheim GmbH, Germany] is ligated to the isolated linear vector after
digestion with SalI and BamHI. An aliquot of the ligation mixture is added to calcium-treated,
transformation-competent E. coli HB101 cells [Invitrogen, San Diego, USA]. Four ampicillin-resistant E. coli
transformants are grown in the presence of 100 µg/ml ampicillin. Plasmid DNA is prepared and analysed by digestion
with SalI-BamHI. One plasmid having the expected restriction fragments is referred to as pDP34A. The human
insulin-like growth factor-1 (IGF-1) gene expression cassette, for expression in yeast, is ligated into the BamHI
site of pDP34A. The DNA sequence of the expression cassette,

        BamHI                                               BamHI
          |    GAPDH    |    aFL    |    IGF1    |    aFT    |
     5'   |-------------|-----------|------------|-----------|   3'
              400 bp       255 bp       216 bp      275 bp

is shown under SEQ ID No. 5. It consists of a BamHI-cleavable linker, followed by an about 400 bp fragment
of the S. cerevisiae glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter, then the S. cerevisiae
α-factor leader sequence (aFL) encoding the first 85 amino acids of the α-factor precursor (J. Kurjan et al., Cell
30:933-943, 1982), directly followed by a chemically synthesized IGF-1 gene [G.T. Mullenbach, A.L. Choo, M.S.
Urdea, P.J. Barr, J.P. Merryweather, A.J. Brake, and P. Valenzuela, Fed. Proc. 42, 434 (abstr.) (1983)], the about
275 bp S. cerevisiae α-factor terminator (aFT; Kurjan et al., Cell 30:933-943, 1982) and a second BamHI-cleavable
linker. An aliquot of the ligation mixture is transformed into E. coli HB101. Plasmid DNA from 6 independent
transformants is analysed with SalI as well as BamHI. One clone with the promoter of the expression cassette
oriented 3' to the SalI-BamHI fragment is named pDP34A/GAPDH-aFL-IGF1-aFT.
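
The element sizes of the expression cassette can be cross-checked with simple arithmetic. The following Python sketch uses the sizes given above. The length of mature IGF-1 (70 amino acids) and the two stop codons after the IGF-1 gene (mentioned in example 4) are added for the check, the dictionary keys are hypothetical labels, and the linker sequences are ignored.

```python
# Element sizes in bp as given for the expression cassette.
cassette_bp = {
    "GAPDH promoter": 400,
    "aFL": 255,
    "IGF1": 216,
    "aFT": 275,
}

AFL_RESIDUES = 85    # first 85 amino acids of the alpha-factor precursor
IGF1_RESIDUES = 70   # assumption: length of mature human IGF-1
STOP_CODONS = 2      # two stop codons after the IGF-1 gene (see example 4)

total_bp = sum(cassette_bp.values())
afl_bp = AFL_RESIDUES * 3
igf1_bp = (IGF1_RESIDUES + STOP_CODONS) * 3

print(total_bp, afl_bp, igf1_bp)
```

The sum agrees with the 1146 bp BamHI fragment used in example 4, and the coding lengths agree with the 255 bp and 216 bp elements.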



Example 4: Construction of two mutated α-factor leader sequences

A 1146 bp BamHI fragment, consisting of the 400 bp GAPDH promoter, the 255 bp aFL sequence, the
216 bp chemically synthesized IGF-1 gene (IGF-1 gene and 2 stop codons) and the 275 bp aFT, is released from
pDP34A/GAPDH-aFL-IGF1-aFT (see example 3). It is ligated to the BamHI-digested, bacterial alkaline
phosphatase (Gibco-BRL, Basel, Switzerland) treated replicative form (RF) of phage vector M13mp18 (Boehringer
Mannheim GmbH, Germany). An aliquot of the ligation mixture is transformed into E. coli JM101. Plasmid DNA from
6 plaques is analysed with EcoRI, BamHI, and BamHI-SalI. One RF clone with the appropriate restriction
fragments and with the promoter directly adjacent to the EcoRI site of the vector is selected and called
mp18/BamHI/GAPDH-aFL-IGF1-aFT. Site-directed mutagenesis using the two-primer protocol [M.J. Zoller and M. Smith,
Meth. in Enzymol. 154, 329-350 (1987)] employing the mutagenic oligodeoxyribonucleotide primer with SEQ
ID No. 6 gives a new sequence of the aFL, changing the amino acids Ala20 to Asp20 and Pro21 to Leu21.
Single-stranded DNA obtained from one positive clone after hybridization with the radioactively labelled mutagenic
primer is sequenced [F. Sanger, S. Nicklen and A.R. Coulson, Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467
(1977)] to confirm the desired mutations. The mutated aFL sequence is named aFLMut2 and the resultant
phage is called mp18/BamHI/GAPDH-aFLMut2-IGF1-aFT.
Site-directed mutagenesis using the four mutagenic oligodeoxyribonucleotide primers with SEQ ID No.
7, 8, 9 and 10 yields an aFL sequence in which the following amino acids are exchanged:
Ala13 to Asn13, Gln32 to Asn32, Pro34 to Thr34, Gly40 to Asn40, Lys76 to Asn76, and Glu78 to Thr78.
DNA sequencing on single-stranded DNA template confirms all mutations.

The mutated aFL sequence is named aFLG1G2G3G5 and the phage is referred to as
mp18/BamHI/GAPDH-aFLG1G2G3G5-IGF1-aFT.

Example 5: Construction of yeast vectors containing GAPDH-aFL-IGF1-aFT, GAPDH-aFLMut2-IGF1-aFT,
and GAPDH-aFLG1G2G3G5-IGF1-aFT

To create a unique BglII site in the vector pDP34A (see example 3), plasmid DNA is digested to completion
with SacI and the 3' overhang is made blunt with T4 DNA polymerase (New England BioLabs, Beverly, MA, USA).
The linearized blunt-ended vector pDP34A is ligated to BglII linkers (Boehringer Mannheim GmbH, Germany).
After linker ligation, the vector DNA is digested with BglII and then religated. Plasmid DNA of 6 ampicillin-resistant
transformants, obtained after transformation of an aliquot of the religated mixture into E. coli HB101, is
analysed with restriction enzymes BglII-SalI and BglII-ScaI. One clone with the expected restriction fragments,
confirming the creation of a BglII site in place of the SacI site, is designated as pDP34B.

pDP34B is digested to completion with BamHI and is treated with bacterial alkaline phosphatase. This
linearized vector DNA is used to subclone the 1146 bp BamHI fragments obtained from
pDP34A/GAPDH-aFL-IGF1-aFT (see example 3), mp18/BamHI/GAPDH-aFLMut2-IGF1-aFT (see example 4) and
mp18/BamHI/GAPDH-aFLG1G2G3G5-IGF1-aFT (see example 4). After ligation, an aliquot from each of the three
ligation mixtures is transformed into E. coli HB101. Plasmid DNA of four individual transformants from each of the
three ligations is analysed with SalI to determine the orientation of the BamHI fragments with respect to the
SalI-BamHI pBR322 fragment. Plasmids yielding a 1147 bp fragment, with the pBR322 DNA at the 5' end of the promoter,
are chosen and are named pDP34B/BamHI/GAPDH-aFL-IGF1-aFT, pDP34B/BamHI/GAPDH-aFLMut2-IGF1-aFT,
and pDP34B/BamHI/GAPDH-aFLG1G2G3G5-IGF1-aFT.

Example 6: Construction of yeast vectors which contain, on the same plasmid, expression cassettes for
KEX2p and for IGF-1 with the wild type α-factor leader secretion signal

The yeast vector pDP34B (example 5) is digested to completion with BglII and treated with bacterial
alkaline phosphatase. Plasmid pKS301b (example 1) is digested with BamHI and the ~3210 bp fragment
containing the complete KEX2 gene is isolated and ligated to the linearized vector pDP34B. An aliquot of the
ligation mixture is transformed into E. coli HB101 and plasmid DNA of four transformants is examined by
restriction analysis with BamHI and BglII. One clone with the expected restriction fragments is known as
pDP34B/KEX2.

pDP34B/KEX2 is digested to completion with BamHI and treated with bacterial alkaline phosphatase.
A 1146 bp BamHI fragment containing the IGF-1 expression cassette isolated from
pDP34A/GAPDH-aFL-IGF1-aFT (example 3) is ligated to linearized vector pDP34B/KEX2. After transformation, plasmid DNA of four
clones is analysed with SalI and BamHI-BglII. One clone, with the promoter in the IGF-1 expression cassette
3' to the pBR322 SalI-BamHI fragment and the KEX2 gene in the opposite orientation to the IGF-1 cassette,
is chosen and is named pDP34B/KEX2/GAPDH-aFL-IGF1-aFT.

Example 7: Construction of yeast vectors which contain, on the same plasmid, expression cassettes for
KEX2ps and for IGF-1 with the wild type α-factor leader secretion signal

After digestion of plasmid pDPkexp (example 1) with SmaI, BamHI linkers [Boehringer Mannheim GmbH,
Germany] are added, followed by digestion with BamHI and ScaI, which allows isolation of a ~2660 bp BamHI
fragment. This is ligated to linearized pDP34B. Analysis of plasmid DNA of transformants with BamHI and
BamHI-BglII yields one clone with the expected restriction fragments which is named pDP34B/kexp.

The IGF-1 expression cassette is subcloned in the BamHI site of pDP34B/kexp in the same way as in
example 6. Restriction analysis with SalI and BamHI-BglII yields different clones with the promoter of the
IGF-1 expression cassette 3' to the pBR322 SalI-BamHI fragment and the soluble KEX2 in the opposite orientation
to the IGF-1 cassette. One such clone is chosen and is named pDP34B/kexp/GAPDH-aFL-IGF1-aFT.
Example 8: Construction of yeast vectors which contain, on the same plasmid, expression cassettes for
KEX2psHDEL and for IGF-1 with the wild type α-factor leader secretion signal

pDPkexpHDEL (see example 2) is digested with BamHI, and after isolation of the about 2680 bp long
fragment it is ligated to linearized pDP34B. Plasmid DNA of E. coli HB101 transformants is analysed with
BamHI-BglII. One clone with the expected restriction fragments is named pDP34B/kexpHDEL.

The IGF-1 expression cassette is subcloned in the BamHI site of pDP34B/kexpHDEL in the same way as
in example 6. Plasmid DNA of ampicillin-resistant E. coli HB101 transformants is analysed with SalI and
BamHI-BglII. One clone with the promoter of the IGF-1 expression cassette 3' to the pBR322 SalI-BamHI fragment
and the soluble KEX2HDEL in the opposite orientation to the IGF-1 cassette is referred to as
pDP34B/kex2pHDEL/GAPDH-aFL-IGF1-aFT.



Example 9: Construction of plasmids pDP34B/KEX2/GAPDH-aFLMut2-IGF1-aFT,
pDP34B/kexp/GAPDH-aFLMut2-IGF1-aFT, pDP34B/kexpHDEL/GAPDH-aFLMut2-IGF1-aFT,
pDP34B/KEX2/GAPDH-aFLG1G2G3G5-IGF1-aFT, pDP34B/kexp/GAPDH-aFLG1G2G3G5-IGF1-aFT,
and pDP34B/kexpHDEL/GAPDH-aFLG1G2G3G5-IGF1-aFT






These plasmids are constructed in a way similar to the procedures detailed in examples 6, 7 and 8. The
expression cassettes, BamHI fragments of GAPDH-aFLMut2-IGF1-aFT and GAPDH-aFLG1G2G3G5-IGF1-aFT, are
isolated from pDP34B/BamHI/GAPDH-aFLMut2-IGF1-aFT (see example 5) and
pDP34B/BamHI/GAPDH-aFLG1G2G3G5-IGF1-aFT (see example 5) and subcloned in yeast vectors already containing
the KEX2, soluble KEX2 or soluble KEX2HDEL gene.

Example 10: Construction of a kex2⁻ mutant of the yeast strain AB110

pKS301b (example 1) is cut at the unique BglII site in the KEX2 gene. A ~2920 bp BglII fragment from
the plasmid YEp13 [J. Broach et al., Gene 8, 121-133 (1979)] is ligated to the linearized vector pKS301b. An
aliquot of the ligation mixture is transformed into E. coli HB101. Plasmid DNA from twelve ampicillin-resistant
transformants is analysed with HindIII-EcoRI. One clone with the expected fragments is referred to as
pUC19/kex2::LEU2. This plasmid has the coding sequence of the KEX2 gene disrupted by the functional LEU2
gene. pUC19/kex2::LEU2 is digested with BamHI to release the linear kex2::LEU2 fragment. The yeast strain
AB110 is used for transformation (example 11) with the linearized DNA. Transformants are selected for leucine
prototrophy. Genomic DNA of four LEU⁺ transformants is digested with EcoRI-HindIII. To confirm that the
genomic copy of KEX2 is indeed disrupted by the LEU2 gene, Southern blot analysis is performed. One yeast
transformant with the expected restriction fragments is named AB110 kex2⁻.

Example 11: Transformation of S. cerevisiae strains AB110 and AB110 kex2⁻

Yeast transformation is carried out as described by Klebe et al. [Gene 25, 333-341 (1983)].
S. cerevisiae AB110 is transformed (see example 12) with the plasmids compiled hereinafter and the
transformants are named as indicated:





Plasmid                                              Transformant name

pDP34B/GAPDH-aFL-IGF1-aFT (example 5)                yIG 1
pDP34B/KEX2/GAPDH-aFL-IGF1-aFT (example 6)           yIG 2
pDP34B/KEX2HDEL/GAPDH-aFL-IGF1-aFT (example 8)       yIG 3
pDP34B/GAPDH-aFLMut2-IGF1-aFT (example 5)            yIG 4
pDP34B/GAPDH-aFLG1G2G3G5-IGF1-aFT (example 5)        yIG 5



Three colonies of each of the transformants are selected and designated with an additional number (viz.
yIG 1-1, yIG 1-2, yIG 1-3).

S. cerevisiae AB110 kex2⁻ (see example 10) is transformed with the plasmids compiled hereinafter and
the transformants are named as indicated:
Plasmid                                                  Transformant name

pDP34B/GAPDH-aFL-IGF1-aFT (example 5)                    yIG 6
pDP34B/KEX2/GAPDH-aFL-IGF1-aFT (example 6)               yIG 7
pDP34B/KEX2/GAPDH-aFLMut2-IGF1-aFT (example 9)           yIG 8
pDP34B/kexp/GAPDH-aFLMut2-IGF1-aFT (example 9)           yIG 9
pDP34B/kexpHDEL/GAPDH-aFLMut2-IGF1-aFT (example 9)       yIG 10
pDP34B/KEX2/GAPDH-aFLG1G2G3G5-IGF1-aFT (example 9)       yIG 11
pDP34B/kexp/GAPDH-aFLG1G2G3G5-IGF1-aFT (example 9)       yIG 12
pDP34B/kexpHDEL/GAPDH-aFLG1G2G3G5-IGF1-aFT (example 9)   yIG 13



Three colonies of each of the transformants are selected and designated with an additional number (viz.
yIG 6-1, yIG 6-2, yIG 6-3).



Example 12: Growth of yeast transformants in shake-flask cultures and quantitative/qualitative
determination of IGF-1 protein by high performance liquid chromatography (HPLC) and Western blots

S. cerevisiae AB110 (Matα, his 4-580, leu2, ura 3-52, pep 4-3, [cir°]) is described elsewhere [P.J. Barr et
al., J. Biol. Chem. 263, 16471-16478 (1988)]. A rich medium containing 6.5 g/l yeast extract, 4.5 g/l casamino
acids and 30 g/l glucose is used as non-selective pre-culture medium. IGF-1 is expressed in the main culture,
which is a uracil-selective medium containing 1.7 g/l yeast nitrogen base supplemented with 30 g/l glucose,
8.5 g/l casamino acids and the required amino acids. Yeast transformants (see example 11) are grown at 30°C
on a rotary shaker at 180 rev/min for 24 h in a 20 ml volume of the pre-culture medium and for 72 h in a 80 ml
volume of main culture.

Aliquots of cells are harvested and the secreted, active monomeric IGF-1 molecule in the culture medium
is measured by HPLC and ELISA [K. Steube et al., Eur. J. Biochem. 198, 651-657 (1991)].

Aliquots of grown cultures are centrifuged for 2 minutes at 13000 x g. Cells are resuspended in 3x Laemmli
buffer [6 % SDS, 0.15 M Tris pH 6.8, 6 mM EDTA, 30 % glycerol, 0.05 % bromophenol blue] and lysed by vigorous
shaking with glass beads followed by incubation of the samples for 3 minutes in a boiling water bath. Proteins
from the cell lysate are separated by SDS-PAGE using a 15 % polyacrylamide gel [U.K. Laemmli, Nature 227,
680-685 (1970)]. Proteins are electroblotted onto nitrocellulose filters with the aid of a semi-dry blotter
[Sartorius GmbH, Germany]. The transferred proteins are detected with anti-IGF-1 antibodies following the
procedure supplied with the Bio-Rad immune assay kit [Bio-Rad, Richmond, CA, USA].

Example 13: A comparison of secreted and intracellular IGF-1 protein(s) by HPLC and Western blot from
transformants yIG 1, yIG 6, and yIG 7

Secreted IGF-1 from transformants of plasmid pDP34B/GAPDH-aFL-IGF1-aFT (see example 5) in yeast
strains AB110 (transformants yIG 1-1, yIG 1-2 and yIG 1-3) and AB110 kex2⁻ (transformants yIG 6-1, yIG 6-2
and yIG 6-3) is compared by HPLC and the results are depicted in Table 1.
Table 1:

Transformant     HPLC titre in mg/l

yIG 1-1          8
yIG 1-2          7
yIG 1-3          7
yIG 6-1          0
yIG 6-2          0
yIG 6-3          0



Western blot analysis of intracellular protein from transformants yIG 6-1, yIG 6-2, and yIG 6-3 shows
IGF-1 where the processing of the aFL has not occurred.

The results imply that no mature IGF-1 is secreted into the media from yeast strains which lack a functional
copy of KEX2 on the chromosome. When a functional copy of KEX2 is reintroduced on a plasmid, e.g.
pDP34B/KEX2/GAPDH-aFL-IGF1-aFT (see example 6), into the yeast strain AB110 kex2⁻ (transformants
yIG 7-1, yIG 7-2, and yIG 7-3), secreted IGF-1 is again observed.

Example 14: A comparison of secreted IGF-1 protein by HPLC and ELISA from transformants yIG 1, yIG 2,
and yIG 3

HPLC measures the amount of active, monomeric IGF-1 in the supernatant. ELISA determines the total
amount of IGF-1-like species in the supernatant. Besides the monomer, ELISA quantifies the amounts of
intermolecular disulfide bridged dimers and multimers, malfolded IGF-1, oxidized IGF-1, and other molecules.
Table 2 shows a comparison of the HPLC titres and ELISA values of secreted IGF-1 from transformants
co-expressing IGF-1 and KEX2p (yIG 1 and yIG 2) and transformants co-expressing IGF-1 and soluble
KEX2HDELp (yIG 3).



Table 2:

Transformants    HPLC titres in mg/l    ELISA values in mg/l

yIG 1            9                      98
yIG 2            8                      92
yIG 3            9                      27

These results are the average values obtained from 3 individual strains from each of the 3 transformations.

Co-expression of soluble KEX2HDEL shows that the formation of molecules other than monomers has been
drastically reduced.
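
The reduction can be made explicit by relating the HPLC titre (monomer) to the ELISA value (all IGF-1-like species). The Python sketch below uses the averages from Table 2. It assumes that the two assays are comparable on the same mg/l scale, which the text does not state, so the resulting fractions are only indicative; the function and variable names are hypothetical.

```python
# (HPLC titre, ELISA value) in mg/l, averages from Table 2.
table2 = {
    "yIG 1": (9, 98),
    "yIG 2": (8, 92),
    "yIG 3": (9, 27),
}


def monomer_fraction(hplc_mg_l: float, elisa_mg_l: float) -> float:
    """Share of the ELISA-detected material that is active monomer."""
    return hplc_mg_l / elisa_mg_l


def other_species(hplc_mg_l: float, elisa_mg_l: float) -> float:
    """IGF-1-like material detected by ELISA but not by HPLC, in mg/l."""
    return elisa_mg_l - hplc_mg_l


for name, (hplc, elisa) in table2.items():
    share = 100 * monomer_fraction(hplc, elisa)
    print(f"{name}: {share:.0f} % monomer, {other_species(hplc, elisa)} mg/l other species")
```

On these numbers the monomer titre is essentially unchanged in yIG 3, while the amount of non-monomeric material is several-fold lower than in yIG 1 and yIG 2.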

Example 15: A comparison of secreted iGF-1 protein from transformants ylG 1 , ylG 4, ylG 8, ylG 9 and ylG 
10 by HPLC analysis - ■ . 

The mutated leader sequence aFLMut2 does not allow secretion of IGF-1 in strain AB110. Glycosylated,
unprocessed aFL-IGF-1 molecules accumulate inside the cell. From the nature of the glycosylation (only
core-glycosylation observed), it is evident that these molecules have not traversed beyond the endoplasmic
reticulum due to mutations in the aFL sequence. Co-expression of IGF-1 using the aFLMut2 secretion signal,
along with the three different forms of the KEX2 enzyme (KEX2, soluble KEX2 and soluble KEX2HDEL) in
AB110 kex2-, shows that the soluble KEX2HDEL protein is different from the other two.

Western blot analysis of intracellular IGF-1-like proteins from transformants ylG 1, ylG 4, ylG 8, ylG 9 and
ylG 10 reveals that only soluble KEX2HDEL protein releases mature IGF-1 from the intracellular pool.

Example 16: Analysis of secreted IGF-1 protein from transformants ylG 1, ylG 5, ylG 11, ylG 12 and ylG 13
by HPLC analysis

The mutated leader sequence aFLG1G2G3G5 allows poor secretion of IGF-1 in strain AB110. Unglycosylated,
unprocessed aFL-IGF-1 molecules accumulate inside the cell. It appears that these molecules lack
the signal sequence of the aFL, which signifies that translocation into the ER has occurred. However, entry
into the ER has not caused glycosylation of the three possible sequons (Asn-X-Ser/Thr) in the proregion of
the aFL. Co-expression of IGF-1 with three different forms of the KEX2 enzyme (KEX2, soluble KEX2 and
soluble KEX2HDEL) in AB110 kex2- shows that the soluble KEX2HDEL protein expressed in ylG 13 is unique
in permitting more mature IGF-1 to be released from the intracellular pool.

Example 17: A time course experiment to study the kinetics of secretion of monomeric IGF-1 from yeast
transformants ylG 1, ylG 5 and ylG 13

The release of the proregion of the aFL from IGF-1 in the ER instead of in the Golgi may affect the total
amount of monomeric IGF-1 secreted at different time points. It is probable that the proregion has a role in
facilitating export of the unprocessed IGF-1 protein from the ER to the Golgi. To address this possibility, three
individual strains from ylG 1, ylG 5 and ylG 13 are grown in shake flasks and the secretion of monomeric IGF-1
is measured by HPLC, taking aliquots of supernatants from the yeast cultures after 40h, 48h, 60h and 72h.
The average values obtained from three individual strains (e.g. ylG 1-1, ylG 1-2, ylG 1-3, and ylG 5-1, ylG 5-2,
ylG 5-3, and ylG 13-1, ylG 13-2, ylG 13-3), belonging to each of the three transformations, ylG 1, ylG 5
and ylG 13, are shown in Table 3.



Table 3:

Strain    Secreted IGF-1 (mg/l) after
          40h    48h    60h    72h
ylG 1     2.5    4      7      8.5
ylG 5     0.8    1      1.2    1.5
ylG 13    2.5    4.5    9.2    6
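
The time course in Table 3 can be summarised numerically. The sketch below is illustrative only: it takes the twelve values exactly as they appear in Table 3 and derives, for each transformant, the time point with the highest titre and the average change in titre between consecutive samples. The helper names `peak` and `interval_rates` are hypothetical.

```python
# Secreted monomeric IGF-1 (mg/l) from Table 3, keyed by sampling time in hours.
table3 = {
    "ylG 1": {40: 2.5, 48: 4.0, 60: 7.0, 72: 8.5},
    "ylG 5": {40: 0.8, 48: 1.0, 60: 1.2, 72: 1.5},
    "ylG 13": {40: 2.5, 48: 4.5, 60: 9.2, 72: 6.0},
}


def peak(series):
    """Return (time in hours, titre) for the highest titre in one time series."""
    best_time = max(series, key=series.get)
    return best_time, series[best_time]


def interval_rates(series):
    """Average change in titre (mg/l per hour) between consecutive sampling times."""
    times = sorted(series)
    return {
        (start, end): (series[end] - series[start]) / (end - start)
        for start, end in zip(times, times[1:])
    }


peaks = {strain: peak(series) for strain, series in table3.items()}
rates = {strain: interval_rates(series) for strain, series in table3.items()}

for strain in table3:
    time_h, titre = peaks[strain]
    print(f"{strain}: highest titre {titre} mg/l at {time_h}h")
```

A negative rate for an interval simply reflects a lower tabulated value at the later time point; the sketch does not interpret why that occurs.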



Example 18: Analysis of secreted IGF-1 by Western blots shows appreciable decrease of dimeric forms using
soluble KEX2HDEL protein



Supernatants from ylG 1 and ylG 13 (example 17) have been analysed by Western blots under non-reducing
and reducing conditions.

At time points 40h, 48h, and 60h the formation of intermolecular disulphide-bridged IGF-1 molecules is
not observed using soluble KEX2HDEL protein. Only at 72h does one see a negligible amount (barely visible
on the blot) of dimeric IGF-1. However, strains expressing KEX2p do show dimers at every time point. These
dimers can be reduced by dithiothreitol (DTT), implying that the dimers are indeed disulfide bonded.



Deposited microorganisms 



The following microorganism strains are deposited according to the Budapest Treaty at the Deutsche
Sammlung von Mikroorganismen (DSM), Mascheroder Weg 1b, D-3300 Braunschweig (deposition dates and
accession numbers given):

Escherichia coli JM109/pDP34: March 14, 1988, DSM 447^^
Escherichia coli JM101/pKS301b: June 25, 1990, DSM 6028.



Sequence Listing
SEQ ID No. 1

Sequence type: Polynucleotide with corresponding polypeptide 

Sequence length: 1866 base pairs 

Strandedness: double 

Topology: linear 

Source: yeast genomic DNA 

Immediate experimental source: E. coli JM101/pKS301b (DSM 6028)
Features: from 1 to 1866 coding region for soluble KEX2 



ATG AAA GTG AGG AAA TAT ATT ACT TTA TGC TTT TGG TGG   39
Met Lys Val Arg Lys Tyr Ile Thr Leu Cys Phe Trp Trp
1               5                   10

GCC TTT TCA ACA TCC GCT CTT GTA TCA TCA CAA CAA ATT   78
Ala Phe Ser Thr Ser Ala Leu Val Ser Ser Gln Gln Ile
    15                  20                  25

CCA TTG AAG GAC CAT ACG TCA CGA CAG TAT TTT GCT GTA   117
Pro Leu Lys Asp His Thr Ser Arg Gln Tyr Phe Ala Val
            30                  35

GAA AGC AAT GAA ACA TTA TCG CGC TTG GAG GAA ATG CAT   156
Glu Ser Asn Glu Thr Leu Ser Arg Leu Glu Glu Met His
40                  45                  50

CCA AAT TGG AAA TAT GAA CAT GAT GTT CGA GGG CTA CCA   195
Pro Asn Trp Lys Tyr Glu His Asp Val Arg Gly Leu Pro
        55                  60                  65

AAC CAT TAT GTT TTT TCA AAA GAG TTG CTA AAA TTG GGC   234
Asn His Tyr Val Phe Ser Lys Glu Leu Leu Lys Leu Gly
                70                  75

AAA AGA TCA TCA TTA GAA GAG TTA CAG GGG GAT AAC AAC   273
Lys Arg Ser Ser Leu Glu Glu Leu Gln Gly Asp Asn Asn
    80                  85                  90

GAC CAC ATA TTA TCT GTC CAT GAT TTA TTC CCG CGT AAC   312
Asp His Ile Leu Ser Val His Asp Leu Phe Pro Arg Asn
            95                  100

GAC CTA TTT AAG AGA CTA CCG GTG CCT GCT CCA CCA ATG   351
Asp Leu Phe Lys Arg Leu Pro Val Pro Ala Pro Pro Met
105                 110                 115

GAC TCA AGC TTG TTA CCG GTA AAA GAA GCT GAG GAT AAA   390
Asp Ser Ser Leu Leu Pro Val Lys Glu Ala Glu Asp Lys
        120                 125                 130

CTC AGC ATA AAT GAT CCG CTT TTT GAG AGG CAG TGG CAC   429
Leu Ser Ile Asn Asp Pro Leu Phe Glu Arg Gln Trp His
                135                 140

TTG GTC AAT CCA AGT TTT CCT GGC AGT GAT ATA AAT GTT   468
Leu Val Asn Pro Ser Phe Pro Gly Ser Asp Ile Asn Val
    145                 150                 155

CTT GAT CTG TGG TAC AAT AAT ATT ACA GGC GCA GGG GTC   507
Leu Asp Leu Trp Tyr Asn Asn Ile Thr Gly Ala Gly Val
            160                 165

GTG GCT GCC ATT GTT GAT GAT GGC CTT GAC TAC GAA AAT   546
Val Ala Ala Ile Val Asp Asp Gly Leu Asp Tyr Glu Asn
170                 175                 180

GAA GAC TTG AAG GAT AAT TTT TGC GCT GAA GGT TCT TGG   585
Glu Asp Leu Lys Asp Asn Phe Cys Ala Glu Gly Ser Trp
        185                 190                 195

GAT TTC AAC GAC AAT ACC AAT TTA CCT AAA CCA AGA TTA   624
Asp Phe Asn Asp Asn Thr Asn Leu Pro Lys Pro Arg Leu
                200                 205

TCT GAT GAC TAC CAT GGT ACG AGA TGT GCA GGT GAA ATA   663
Ser Asp Asp Tyr His Gly Thr Arg Cys Ala Gly Glu Ile
    210                 215                 220

GCT GCC AAA AAA GGT AAC AAT TTT TGC GGT GTC GGG GTA   702
Ala Ala Lys Lys Gly Asn Asn Phe Cys Gly Val Gly Val
            225                 230

GGT TAC AAC GCT AAA ATC TCA GGC ATA AGA ATC TTA TCC   741
Gly Tyr Asn Ala Lys Ile Ser Gly Ile Arg Ile Leu Ser
235                 240                 245

GGT GAT ATC ACT ACG GAA GAT GAA GCT GCG TCC TTG ATT   780
Gly Asp Ile Thr Thr Glu Asp Glu Ala Ala Ser Leu Ile
        250                 255                 260

TAT GGT CTA GAC GTA AAC GAT ATA TAT TCA TGC TCA TGG   819
Tyr Gly Leu Asp Val Asn Asp Ile Tyr Ser Cys Ser Trp
                265                 270

GGT CCC GCT GAT GAC GGA AGA CAT TTA CAA GGC CCT AGT   858
Gly Pro Ala Asp Asp Gly Arg His Leu Gln Gly Pro Ser
    275                 280                 285

GAC CTG GTG AAA AAG GCT TTA GTA AAA GGT GTT ACT GAG   897
Asp Leu Val Lys Lys Ala Leu Val Lys Gly Val Thr Glu
            290                 295

GGA AGA GAT TCC AAA GGA GCG ATT TAC GTT TTT GCC AGT   936
Gly Arg Asp Ser Lys Gly Ala Ile Tyr Val Phe Ala Ser
300                 305                 310

GGA AAT GGT GGA ACT CGT GGT GAT AAT TGC AAT TAC GAC   975
Gly Asn Gly Gly Thr Arg Gly Asp Asn Cys Asn Tyr Asp
        315                 320                 325

GGC TAT ACT AAT TCC ATA TAT TCT ATT ACT ATT GGG GCT   1014
Gly Tyr Thr Asn Ser Ile Tyr Ser Ile Thr Ile Gly Ala
                330                 335

ATT GAT CAC AAA GAT CTA CAT CCT CCT TAT TCC GAA GGT   1053
Ile Asp His Lys Asp Leu His Pro Pro Tyr Ser Glu Gly
    340                 345

TGT TCC GCC GTC ATG GCA GTC ACG TAT TCT TCA GGT TCA   1092
Cys Ser Ala Val Met Ala Val Thr Tyr Ser Ser Gly Ser
            355                 360

GGC GAA TAT ATT CAT TCG AGT GAT ATC AAC GGC AGA TGC   1131
Gly Glu Tyr Ile His Ser Ser Asp Ile Asn Gly Arg Cys
365                 370                 375

AGT AAT AGC CAC GGT GGA ACG TCT GCG GCT GCT CCA TTA   1170
Ser Asn Ser His Gly Gly Thr Ser Ala Ala Ala Pro Leu
        380                 385                 390

GCT GCC GGT GTT TAC ACT TTG TTA CTA GAA GCC AAC CCA   1209
Ala Ala Gly Val Tyr Thr Leu Leu Leu Glu Ala Asn Pro
                395                 400

AAC CTA ACT TGG AGA GAC GTA CAG TAT TTA TCA ATC TTG   1248
Asn Leu Thr Trp Arg Asp Val Gln Tyr Leu Ser Ile Leu
    405                 410                 415

TCT GCG GTA GGG TTA GAA AAG AAC GCT GAC GGA GAT TGG   1287
Ser Ala Val Gly Leu Glu Lys Asn Ala Asp Gly Asp Trp
            420                 425

AGA GAT AGC GCC ATG GGG AAG AAA TAC TCT CAT CGC TAT   1326
Arg Asp Ser Ala Met Gly Lys Lys Tyr Ser His Arg Tyr
430                 435                 440

GGC TTT GGT AAA ATC GAT GCC CAT AAG TTA ATT GAA ATG   1365
Gly Phe Gly Lys Ile Asp Ala His Lys Leu Ile Glu Met
        445                 450                 455

TCC AAG ACG TGG GAG AAT GTT AAC GCA CAA ACC TGG TTT   1404
Ser Lys Thr Trp Glu Asn Val Asn Ala Gln Thr Trp Phe
                460                 465

TAC CTG CCA ACA TTG TAT GTT TCC CAG TCC ACA AAC TCC   1443
Tyr Leu Pro Thr Leu Tyr Val Ser Gln Ser Thr Asn Ser
    470                 475                 480

ACG GAA GAG ACA TTA GAA TCC GTC ATA ACC ATA TCA GAA   1482
Thr Glu Glu Thr Leu Glu Ser Val Ile Thr Ile Ser Glu
            485                 490

AAA AGT CTT CAA GAT GCT AAC TTC AAG AGA ATT GAG CAC   1521
Lys Ser Leu Gln Asp Ala Asn Phe Lys Arg Ile Glu His
495                 500                 505

GTC ACG GTA ACT GTA GAT ATT GAT ACA GAA ATT AGG GGA   1560
Val Thr Val Thr Val Asp Ile Asp Thr Glu Ile Arg Gly
        510                 515                 520

ACT ACG ACT GTC GAT TTA ATA TCA CCA GCG GGG ATA ATT   1599
Thr Thr Thr Val Asp Leu Ile Ser Pro Ala Gly Ile Ile
                525                 530

TCA AAC CTT GGC GTT GTA AGA CCA AGA GAT GTT TCA TCA   1638
Ser Asn Leu Gly Val Val Arg Pro Arg Asp Val Ser Ser
    535                 540                 545

GAG GGA TTC AAA GAC TGG ACA TTC ATG TCT GTA GCA CAT   1677
Glu Gly Phe Lys Asp Trp Thr Phe Met Ser Val Ala His
            550                 555

TGG GGT GAG AAC GGC GTA GGT GAT TGG AAA ATC AAG GTT   1716
Trp Gly Glu Asn Gly Val Gly Asp Trp Lys Ile Lys Val
560                 565                 570

AAG ACA ACA GAA AAT GGA CAC AGG ATT GAC TTC CAC AGT   1755
Lys Thr Thr Glu Asn Gly His Arg Ile Asp Phe His Ser
        575                 580                 585

TGG AGG CTG AAG CTC TTT GGG GAA TCC ATT GAT TCA TCT   1794
Trp Arg Leu Lys Leu Phe Gly Glu Ser Ile Asp Ser Ser
                590                 595

AAA ACA GAA ACT TTC GTC TTT GGA AAC GAT AAA GAG GAG   1833
Lys Thr Glu Thr Phe Val Phe Gly Asn Asp Lys Glu Glu
    600                 605                 610

GTT GAA CCA GGG GTA CCG AGC TCG AAT TCG TAA   1866
Val Glu Pro Gly Val Pro Ser Ser Asn Ser
            615                 620



SEQ ID No. 2

Sequence type: DNA with corresponding peptide

Sequence length: 12 base pairs 

Strandedness: double 

Topology: linear 

Source: yeast genomic DNA 

Immediate experimental source: synthetic 

Features: coding region for ER retention signal HDEL 

CAC GAC GAA TTA 12 
His Asp Glu Leu 
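
The correspondence between the 12 base pairs of SEQ ID No. 2 and the HDEL tetrapeptide can be checked codon by codon. The sketch below is a minimal illustration, assuming the standard genetic code; the `CODONS` dictionary is deliberately partial (only the codons needed here) and the function name `translate` is hypothetical.

```python
# Partial codon table (standard genetic code); only the entries needed for this check.
CODONS = {
    "CAC": "His",
    "GAC": "Asp",
    "GAA": "Glu",
    "TTA": "Leu",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into a list of three-letter residues."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


seq_id_2 = "CAC GAC GAA TTA"
peptide = translate(seq_id_2)
print(" ".join(peptide))
```

A codon outside the partial table raises a `KeyError`, which is intentional here: the sketch only covers the four codons listed in SEQ ID No. 2.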




SEQ ID No. 3
Sequence type: peptide 
Sequence length: 4 amino acids 
Topology: linear 
Source: K. lactis 

Features: ER retention signal DDEL 
Asp Asp Glu Leu 



SEQ ID No. 4

Sequence type: peptide 
Sequence length: 4 amino acids 
Topology: linear 
Source: mammalian cells 
Features: ER retention signal KDEL 



Lys Asp Glu Leu 



SEQ ID No. 5

Sequence type: DNA with corresponding polypeptide
Sequence length: 1179 base pairs
Topology: linear
Strandedness: double
Original experimental source: pDP34A/GAPDH-aFL-IGF1-aFT
Features: Expression cassette for the expression of IGF-I in yeast.
from 1 to 6: BamHI restriction site
from 6 to 404: S. cerevisiae GAPDH promoter
from 405 to 659: S. cerevisiae α-factor leader
from 660 to 869: IGF-I coding region
from 870 to 876: linker encoding two stop codons
from 877 to 1152: S. cerevisiae α-factor terminator
from 1153 to 1158: BamHI restriction site
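
The feature coordinates listed for SEQ ID No. 5 can be sanity-checked with simple arithmetic. The following sketch is illustrative only: the coordinates are copied from the feature list, while the data layout and the helper name `span_length` are assumptions made for this example. It computes the length of each feature, the codon count of the two translated regions, and any place where a feature does not start immediately after the previous one ends.

```python
# (feature name, first position, last position), 1-based inclusive, as listed for SEQ ID No. 5.
features = [
    ("BamHI restriction site (5')", 1, 6),
    ("GAPDH promoter", 6, 404),
    ("alpha-factor leader", 405, 659),
    ("IGF-I coding region", 660, 869),
    ("linker encoding two stop codons", 870, 876),
    ("alpha-factor terminator", 877, 1152),
    ("BamHI restriction site (3')", 1153, 1158),
]


def span_length(first, last):
    """Number of positions covered by an inclusive 1-based interval."""
    return last - first + 1


lengths = {name: span_length(first, last) for name, first, last in features}

# Codon counts for the two translated regions.
leader_codons = lengths["alpha-factor leader"] // 3
igf_codons = lengths["IGF-I coding region"] // 3

# Consecutive features whose coordinates are not exactly adjacent.
mismatches = [
    (prev[0], nxt[0])
    for prev, nxt in zip(features, features[1:])
    if nxt[1] != prev[2] + 1
]

print(leader_codons, igf_codons, mismatches)
```

The codon counts agree with the residue numbering of the listing (leader residues -85 to -1, IGF-I residues 1 to 70). The single non-adjacent pair arises because position 6 is listed both as the end of the BamHI site and as the start of the promoter.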



GGATCCCCAG CTTAGTTCAT AGGTCCATTC TCTTAGCGCA
ACTACAGAGA ACAGGGGCAC AAACAGGCAA AAAACGGGCA
CAACCTCAAT GGAGTGATGC AACCTGCCTG GAGTJ^TGA
TGACACAAGG CAATTGACCC ACGCATGTAT CTATCTCATT
TTCTTACACC TTCTATTACC TTCTGCTCTC TCTGATTTGG
AAAAAGCTGA AAAAAAAGGT TGAAACCAGT TCCCTGAAAT
TATTCCCCTA CTTGACTAAT AAGTATATAA AGACGGTAGG
TATTGATTGT AATTCTGTAA ATCTATTTCT TAAACTTCTT
AAATTCTACT TTTATAGTTA GTCTTTTTTT TAGTTTTAAA
ACACCAAGAA CTTAGTTTCG AATAAACACA GATAAACAAA

CACC ATG AGA TTT CCT TCA ATT TTT ACT GCA GTT TTA
     Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu
     -85                 -80                 -75



TTC GCA GCA TCC TCC GCA TTA GCT GCT CCA GTC
Phe Ala Ala Ser Ser Ala Leu Ala Ala Pro Val
                -70                 -65

AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA ATT
Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln Ile
            -60                 -55

CCG GCT GAA GCT GTC ATC GGT TAC TTA GAT TTA
Pro Ala Glu Ala Val Ile Gly Tyr Leu Asp Leu
        -50                 -45

GAA GGG GAT TTC GAT GTT GCT GTT TTG CCA TTT
Glu Gly Asp Phe Asp Val Ala Val Leu Pro Phe
    -40                 -35



TCC AAC AGC ACA AAT AAC GGG TTA TTG TTT ATA
Ser Asn Ser Thr Asn Asn Gly Leu Leu Phe Ile
-30                 -25                 -20

AAT ACT ACT ATT GCC AGC ATT GCT GCT AAA GAA   635
Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu
                -15                 -10

GAA GGG GTA CAG CTG GAT AAA AGA GGT CCA GAA   668
Glu Gly Val Gln Leu Asp Lys Arg Gly Pro Glu
            -5                  1

ACC TTG TGT GGT GCT GAA TTG GTC GAT GCT TTG   701
Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu
    5                   10

CAA TTC GTT TGT GGT GAC AGA GGT TTC TAC TTC   734
Gln Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe
15                  20                  25

AAC AAG CCA ACC GGT TAC GGT TCT TCT TCT AGA   767
Asn Lys Pro Thr Gly Tyr Gly Ser Ser Ser Arg
                30                  35

AGA GCT CCA CAA ACC GGT ATC GTT GAC GAA TGT   800
Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
            40                  45

TGT TTC AGA TCT TGT GAC TTG AGA AGA TTG GAA   833
Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu
        50                  55

ATG TAC TGT GCT CCA TTG AAG CCA GCT AAG TCT   866
Met Tyr Cys Ala Pro Leu Lys Pro Ala Lys Ser
    60                  65



GCT TGA TAAGTCGACT TTGTTCCCAC TGTACTTTTA   902
Ala
70

GCTCGTACAA AATACAATAT ACTTTTCATT TCTCCGTAAA   942
CAACATGTTT TCCCATGTAA TATCCTTTTC TATTTTTCGT   982
TCCGTTACCA ACTTTACACA TACTTTATAT AGCTATTCAC   1022
TTCTATACAC TAAAAAACTA AGACAATTTT AATTTTGCTG   1062
GGTGCCATAT TTCAATTTGT TATAAATTCC TATAATTTAT   1102
CCTATTAGTA GCTAAAAAAA GATGAATGTG AATCGAATCC   1142
TAAGAGAATT GGATCC



SEQ ID No. 6

Sequence type: DNA 
Sequence length: 31 bases
Strandedness: single
Topology: linear

Source: synthetic oligonucleotide
Immediate experimental source: synthetic
Features: mutagenic oligodesoxyribonucleotide primer



GTAGTGTTGA CTAGATCTGC TAATGCGGAG G 31 



SEQ ID No. 7 

Sequence type: DNA 
Sequence length: 25 bases

Strandedness: single
Topology: linear
Source: synthetic oligonucleotide

Immediate experimental source: synthetic

Features: mutagenic oligodesoxyribonucleotide primer

GCGGAGGATGC GTTGAATAAA ACTGC 25



SEQ ID No. 8 

Sequence type: DNA 
Sequence length: 28 bases
Strandedness: single
Topology: linear
Source: synthetic oligonucleotide

Immediate experimental source: synthetic

Features: mutagenic oligodesoxyribonucleotide primer

CAGCTTCAGC AGTAATGTTT GCCGTTTC 28

SEQ ID No. 9

Sequence type: DNA

Sequence length: 21 bases
Strandedness: single

Topology: linear

Source: synthetic oligonucleotide

Immediate experimental source: synthetic

Features: mutagenic oligodesoxyribonucleotide primer

ATCTAAGTAG TTGATGACAG C 21 



SEQ ID No. 10

Sequence type: DNA 

Sequence length: 30 bases 

Strandedness: single 

Topology: linear 

Source: synthetic oligonucleotide 

Immediate experimental source: synthetic 

Features: mutagenic oligodesoxyribonucleotide primer 

GCTGTACCCC GGTTTCGTTA GCAGCAATGC 



SEQ ID No. 11 

Sequence type: DNA 
Sequence length: 30 bases
Strandedness: single

Topology: linear

Source: synthetic oligonucleotide

Immediate experimental source: synthetic

Features: oligodesoxyribonucleotide encoding "HDEL" and two stop codons; includes
SfuI site.

GTACCGTTCG AACACGACGA ATTATAATAG 30 



SEQ ID No. 12

Sequence type: DNA
Sequence length: 30 bases
Strandedness: single
Topology: linear
Source: synthetic oligonucleotide
Immediate experimental source: synthetic

Features: oligodesoxyribonucleotide hybridizing with HDEL encoding oligonucleotide of
SEQ ID No. 11; includes SfuI site.



AATTCTATTA TAATTCGTCG TGTTCGAACG 30 
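
The stated relationship between SEQ ID No. 11 and SEQ ID No. 12 can be verified directly from the two listed sequences. The sketch below is a minimal check, not a description of how the oligonucleotides were designed: it computes the reverse complement of SEQ ID No. 11, finds the region that base-pairs with SEQ ID No. 12, and locates the HDEL codons, the two stop codons and the SfuI recognition sequence (TTCGAA). The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_length(top, bottom):
    """Length of the longest 3' stretch of `bottom` that pairs with the 3' end of `top`."""
    top_rc = reverse_complement(top)
    for n in range(min(len(top_rc), len(bottom)), 0, -1):
        if bottom[-n:] == top_rc[:n]:
            return n
    return 0


seq11 = "GTACCGTTCGAACACGACGAATTATAATAG"  # SEQ ID No. 11
seq12 = "AATTCTATTATAATTCGTCGTGTTCGAACG"  # SEQ ID No. 12

paired = paired_length(seq11, seq12)
overhang11 = seq11[: len(seq11) - paired]  # unpaired 5' end of SEQ ID No. 11
overhang12 = seq12[: len(seq12) - paired]  # unpaired 5' end of SEQ ID No. 12

hdel_codons = seq11[12:24]
stop_codons = (seq11[24:27], seq11[27:30])
sfui_in_both = "TTCGAA" in seq11 and "TTCGAA" in seq12

print(paired, overhang11, overhang12, hdel_codons, stop_codons, sfui_in_both)
```

On the sequences as listed, 26 of the 30 bases pair, leaving a four-base 5' overhang on each strand, and the HDEL codons match SEQ ID No. 2.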



Claims



1. A process for the preparation of heterologous protein cleaved off a pro-sequence in the host cell, said
process comprising the use of a host cell having a "dibasic processing endoprotease" activity in the ER.

2. A process according to claim 1 comprising the use of a yeast host cell having an ER-located KEX2 or YAP3
protease.

3. A process according to claim 1 comprising the use of a yeast host cell having an ER-located KEX2 protease.

4. A process according to claim 1 for the preparation of heterologous biologically active protein cleaved off
a yeast α-factor pro-sequence.

5. A process according to claim 1 for the preparation of an insulin-related protein.

6. A recombinant DNA molecule encoding an expression cassette for an ER-located "dibasic processing
endoprotease".

7. A recombinant DNA molecule according to claim 6 encoding an ER-located "dibasic processing
endoprotease" consisting of a "dibasic processing endoprotease" and an ER-retention signal.

8. A recombinant DNA molecule according to claim 6 encoding an ER-located "dibasic processing
endoprotease" consisting of a soluble "dibasic processing endoprotease" and an ER-retention signal.

9. A recombinant DNA molecule according to claim 7 encoding a protein selected from the group consisting
of KEX2pHDEL, KEX2p,HDEL and YAP3HDEL.

10. A hybrid vector comprising a recombinant DNA molecule according to claim 6.

11. A host cell transformed with a hybrid vector according to claim 10.

12. A host cell according to claim 11 which is stably transformed.

13. A process for the preparation of a recombinant DNA molecule according to claim 6.

14. A process for the preparation of a host cell according to claim 11.


